[SPARK-16006][SQL] Attempting to write empty DataFrame with no fields throws non-intuitive exception #13730
dongjoon-hyun wants to merge 2 commits into apache:master from dongjoon-hyun:SPARK-16006
Conversation
Since validatePartitionColumn is used only by writing-related classes, I added this description for clarification.
We can update the function description if the usage pattern is changed in the future.
Hi, @tdas.
Test build #60686 has finished for PR 13730 at commit
It's weird that a check like this is in PartitioningUtils. This check seems to have nothing to do with partitioning; it's basically that certain file formats do not support writing a DataFrame with no columns. Is there somewhere earlier where you can check this?
Thank you for review, @tdas.
Yes, indeed. This is beyond the scope of PartitioningUtils.
Actually, this logic is used in 3 classes: PreWriteCheck, DataSource, and FileStreamSinkWriter.
I'll try to move this.
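The check under discussion can be illustrated with a small, self-contained sketch. This is a hypothetical simplification, not the actual Spark source: the real PartitioningUtils.validatePartitionColumn operates on a StructType and takes additional arguments (e.g. for case sensitivity), and the object and method names below are invented for illustration.

```scala
object PartitionCheck {
  // Hypothetical stand-in for the validation discussed above. A plain
  // Seq[String] of field names stands in for Spark's StructType schema.
  def validatePartitionColumns(schemaFields: Seq[String], partitionColumns: Seq[String]): Unit = {
    // Rejecting whenever every column is a partition column also rejected an
    // empty DataFrame (0 fields == 0 partition columns). Guarding on
    // schemaFields.nonEmpty allows an empty DataFrame to be written.
    if (schemaFields.nonEmpty && partitionColumns.size == schemaFields.size) {
      throw new IllegalArgumentException("Cannot use all columns for partition columns")
    }
  }
}
```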
Hi, @tdas. (Anyway, I will update this PR for further discussion.)
Test build #60704 has finished for PR 13730 at commit
Hi, @tdas.
Hi, @rxin.
Oh, sorry. The master was changed.
I will recheck this PR again.
Yep. The case still exists for
Hi, @tdas.
Test build #61014 has finished for PR 13730 at commit
Hi, @tdas.
Rebased.
Test build #61183 has finished for PR 13730 at commit
Test build #61254 has finished for PR 13730 at commit
Ping @tdas |
…throw non-intuitive exception
Test build #61391 has started for PR 13730 at commit
Retest this please.
Test build #61397 has finished for PR 13730 at commit
Hi, @tdas.
Thanks, merging into master/2.0.
…throws non-intuitive exception
## What changes were proposed in this pull request?
This PR allows `emptyDataFrame.write` since the user didn't specify any partition columns.
**Before**
```scala
scala> spark.emptyDataFrame.write.parquet("/tmp/t1")
org.apache.spark.sql.AnalysisException: Cannot use all columns for partition columns;
scala> spark.emptyDataFrame.write.csv("/tmp/t1")
org.apache.spark.sql.AnalysisException: Cannot use all columns for partition columns;
```
After this PR, no exception occurs and the created directory contains only one file, `_SUCCESS`, as expected.
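For contrast with the "Before" snippet, the post-fix behavior described above can be sketched as a spark-shell session (the output comment restates the PR description; it is not independently runnable without a Spark installation):

```scala
scala> spark.emptyDataFrame.write.parquet("/tmp/t1")
// No exception; /tmp/t1 contains only the _SUCCESS marker file.
```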
## How was this patch tested?
Pass the Jenkins tests including updated test cases.
Author: Dongjoon Hyun <dongjoon@apache.org>
Closes #13730 from dongjoon-hyun/SPARK-16006.
(cherry picked from commit 9b1b3ae)
Signed-off-by: Reynold Xin <rxin@databricks.com>
Thank you for merging, @rxin.